REVIEW 



A double-edged sword: R loops as threats 
to genome integrity and powerful 
regulators of gene expression 

Konstantina Skourti-Stathaki and Nicholas J. Proudfoot 

Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, United Kingdom 



R loops are three-stranded nucleic acid structures that 
comprise nascent RNA hybridized with the DNA tem- 
plate, leaving the nontemplate DNA single-stranded. 
R loops form naturally during transcription even though 
their persistent formation can be a risky outcome with 
deleterious effects on genome integrity. On the other 
hand, over the last few years, an increasingly strong case 
has been built for R loops as potential regulators of gene 
expression. Therefore, understanding their function and 
regulation under these opposite situations is essential to 
fully characterize the mechanisms that control genome 
integrity and gene expression. Here we review recent 
findings about these interesting structures that highlight 
their opposite roles in cellular fitness. 



The R-loop structure was first characterized >38 years ago 
(Thomas et al. 1976), and the first demonstration that 
R loops exist in vivo came in 1995 with studies described 
by Crouch and colleagues (Drolet et al. 1995). They 
showed that R-loop formation occurs in a bacterial cell 
and is a consequence of the transcription process (Drolet 
et al. 1995). Since then, and especially over the last decade,
the R-loop field has expanded rapidly, establishing these
structures as potential regulators of gene expression but
also as a major threat to genome stability.

Transcription-mediated R-loop formation 

In general, where R loops have been described in vivo, the 
RNA strand is generated by RNA polymerase II (Pol II) 
transcribing a C-rich DNA template so that a G-rich 
transcript is generated. Interestingly, several studies have 
shown that R loops are formed preferentially when the 
nontemplate strand is G-rich (Reaban et al. 1994; Li and 
Manley 2005; Ginno et al. 2012, 2013). The increased 
thermodynamic stability of a G-rich RNA strand bound 
to C-rich DNA could be a reason for this sequence specificity
(Sugimoto et al. 1995). However, it is as yet unclear
how R loops are generated. According to the "extended 
RNA/DNA hybrid" model, the RNA/DNA hybrid duplex 
could be the result of an extension of the usual 8-base-pair
(bp) RNA/DNA hybrid (Westover et al. 2004) within the 
transcription bubble as Pol II elongates. This model, how- 
ever, is inconsistent with the crystallographic structure of 
Pol II that demonstrates the exit of DNA and RNA mol- 
ecules through different channels (Westover et al. 2004), 
strongly arguing against it (Aguilera and Garcia-Muse 
2012). A more plausible model suggests that the RNA/DNA 
hybrid could arise by threading back the RNA before the 
two strands of the DNA duplex reanneal (called the "thread 
back model"). According to extensive in vitro studies from 
the Lieber laboratory (Roy and Lieber 2009), R loops depend 
on three features: high G density, negative supercoiling,
and DNA nicks (Roy et al. 2010). Initial R-loop formation 
is favored by G clusters and DNA nicks downstream from 
the promoter on the nontemplate DNA strand, whereas 
subsequent RNA/DNA hybrid extension and stabilization 
are enhanced by high G density and negative supercoiling
(Aguilera and Garcia-Muse 2012). 
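
The sequence features named above (G density and G clusters on the nontemplate strand) can be illustrated with a minimal Python sketch. It is not a published R-loop predictor: the function names, the minimum run length of three Gs, the window size, and the density cutoff are all illustrative assumptions, and supercoiling and DNA nicks are not modeled at all.

```python
import re

def g_density(seq: str) -> float:
    """Fraction of G residues in a DNA sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return seq.count("G") / len(seq) if seq else 0.0

def g_clusters(seq: str, min_run: int = 3) -> list[tuple[int, int]]:
    """Return (start, length) for each run of at least min_run consecutive Gs.

    min_run = 3 is an assumed, illustrative threshold.
    """
    pattern = re.compile(r"G{%d,}" % min_run)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]

def scan_windows(nontemplate: str, window: int = 20, step: int = 10,
                 min_density: float = 0.4) -> list[tuple[int, float]]:
    """Slide a window along the nontemplate strand and report G-rich windows.

    Returns (window start, G density) for windows at or above min_density.
    Window size, step, and cutoff are hypothetical values chosen for the demo.
    """
    hits = []
    last_start = max(len(nontemplate) - window, 0)
    for start in range(0, last_start + 1, step):
        density = g_density(nontemplate[start:start + window])
        if density >= min_density:
            hits.append((start, round(density, 2)))
    return hits

if __name__ == "__main__":
    toy = "AGGGTTGGGGGA"  # toy nontemplate-strand sequence, not real data
    print(g_density(toy))
    print(g_clusters(toy))
    print(scan_windows("G" * 20 + "A" * 20))
```

The scan is applied to the nontemplate strand because that is the strand whose G-richness the cited studies associate with R-loop formation.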

Once formed, R loops are particularly stable, as RNA/ 
DNA associations are thermodynamically more stable than 
DNA/DNA interactions (Roberts and Crothers 1992). This 
may be due to the structure of the RNA/DNA hybrid, which 
is thought to adopt a conformation that is an intermediate 
between the A form of a dsRNA and the B form of a DNA 
duplex (Shaw and Arya 2008). Another possibility is that 
a G quadruplex (G4) formed on the single-stranded exposed
strand, such as in the case of the immunoglobulin (Ig) 
class switch region (Duquette et al. 2004), stabilizes the 
R-loop structure. 

R loops have been reported in vivo at prokaryotic origins 
of replication (Masukata and Tomizawa 1984, 1990; Baker 
and Kornberg 1988; Lee and Clayton 1996; Carles-Kinch
and Kreuzer 1997), the mitochondrial origin of replication 
(Xu and Clayton 1996), and the mammalian Ig class 
switch region in activated B lymphocytes (Yu et al. 2003).
In the latter case, R-loop formation is involved in facilitat- 
ing class switch recombination (CSR) that generates 
diverse antibody isotypes. 

R loops are not restricted to Pol II transcripts. Very highly 
transcribed Pol I rDNA repeats also form R loops (El Hage 
et al. 2010). Although Pol III transcripts were not thought
to be associated with R loops, recent genomic analysis of 
R loops in Saccharomyces cerevisiae showed the presence 
of R loops over Pol III transcribed tRNA genes in various 
mutant backgrounds (Chan et al. 2014). This implies that 
Pol III transcripts can also form R loops in normal cells. In 
this review, we focus only on R loops formed in Pol II 
transcripts. 

R loops and genomic instability 

Transcription can be a "risky" process. R loops can lead 
to DNA damage by the exposure of ssDNA formed as 
a result of the RNA/DNA hybridization. Being more
unstable, the ssDNA would then be susceptible to 
lesions and transcription-associated mutagenesis (TAM) or 
transcription-associated recombination (TAR) (see Fig. 1). 
However, the mechanism leading from an R loop to 
genomic instability still remains largely unknown. Possi- 
bly, the unpaired DNA strand resulting from R-loop 
formation is more susceptible to DNA damage such as 
spontaneous deamination of dC to dU, leading to double- 
strand breaks (DSBs) and recombination (Aguilera 2002; 
Li and Manley 2006; Aguilera and Garcia-Muse 2012). 
Thus, in one possible model, accumulation of R loops 
could make certain regions of the genome more prone to 
DNA-damaging agents by increasing the occurrence of 
single-stranded regions. In a second potential model, a pro- 
tein recognizing R-loop structures could be involved in 
initiating mutagenesis. One possible
candidate is the activation-induced cytidine deaminase 
(AID). AID is an enzyme that promotes Ig heavy chain 
CSR and hypermutation in B lymphocytes. It functions 
by deaminating cytosines into uracils on single-strand
target DNA sequences (Muramatsu et al. 2000; Revy et al. 
2000). R loops forming behind elongating Pol II can 
provide the ssDNA substrate for this enzyme (Yu et al. 
2003). The generated U:G mismatch may then be repli- 
cated, creating two daughter species, one of which will 
undergo a C→T transition mutation. However, dU
could also be processed by base excision repair (BER) 
components such as uracil-DNA glycosylase and abasic 
endonuclease (APE). These enzymes remove the uracil, 
creating a DNA nick or leaving an abasic site (a site of 
base loss). Replication past the abasic site will result in 
random incorporation of any of the four nucleotides, 
possibly leading to further mutations. DNA nicks could 
also be converted to DNA DSBs that are recognized by 
the recombination-mediated repair machinery, ulti- 
mately leading to CSR for antibody genes (Di Noia 
and Neuberger 2002) or, more generally, DNA trans- 
locations. However, AID is specifically expressed in 
activated B cells and in chicken DT40 cells (which are 
B-cell-derived). This raises the question of how R-loop
formation leads to genomic instability in cells that lack
AID. DSBs have also been observed in HeLa cells upon
depletion of SRSF1 (Li and Manley 2005), suggesting that
other proteins could function analogously to AID in
different cell types to initiate R-loop-induced genomic
instability. Alternatively, spontaneous dC→dU mutation
may occur at low levels.

Figure 1. R loops as a source of DNA damage. Nascent
transcripts behind elongating Pol II can invade the DNA duplex
and hybridize with the DNA template strand. The RNA/DNA
hybrid so formed displaces the nontemplate strand, and this
three-stranded structure constitutes an R loop. R loops can
cause genomic instability in different ways. First, the displaced
ssDNA can act as a substrate to DNA-damaging agents,
deaminases (AID), and repair enzymes (APE and BER), leading to
DNA lesions and nicks. Second, G4 structures forming on the
G-rich nontemplate strand can generate susceptible sites for
nucleases. Finally, transcription elongation machinery impeded
by stable R loops can cause replication-transcription collisions,
leading to DNA recombination and DSBs. Points of contact
between the DNA strand and nascent RNA indicate R-loop
formation, whereas points of contact within the ssDNA indicate
G4 structures. Pol II is shown as a blue icon, with an arrow
indicating transcription direction. Nucleosomes are shown in green.
The diagram is not drawn to scale.
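
The deamination route described in the text (a cytosine on the displaced strand is converted to uracil, and replication of the resulting U:G mismatch yields one mutant and one unchanged daughter) can be traced with a small Python sketch. This is a toy model under stated assumptions: the function names and the five-base sequence are hypothetical, strands are written in the same left-to-right register so that position i pairs with position i, and repair by BER is ignored.

```python
# Watson-Crick partners; a U on the parental strand templates an A.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

def complement_strand(strand: str) -> str:
    """Base-by-base complement in the same register (no reversal)."""
    return "".join(COMPLEMENT[base] for base in strand)

def deaminate(strand: str, position: int) -> str:
    """Convert the C at the given position to U, as a cytidine deaminase would."""
    if strand[position] != "C":
        raise ValueError("only cytosine can be deaminated in this model")
    return strand[:position] + "U" + strand[position + 1:]

def daughter_sequences(nontemplate: str, template: str) -> tuple[str, str]:
    """Replicate a duplex once and return each daughter's nontemplate-sense sequence.

    Daughter 1 inherits the parental nontemplate strand (possibly carrying U);
    its sequence is read back from the newly synthesized partner strand.
    Daughter 2 inherits the parental template strand, which is undamaged.
    """
    new_on_nontemplate = complement_strand(nontemplate)  # A is placed opposite U
    daughter_1 = complement_strand(new_on_nontemplate)   # former C now reads T
    daughter_2 = complement_strand(template)             # original sequence
    return daughter_1, daughter_2

if __name__ == "__main__":
    nontemplate = "GGCAT"                  # hypothetical toy sequence
    template = complement_strand(nontemplate)
    damaged = deaminate(nontemplate, 2)    # U:G mismatch at position 2
    print(daughter_sequences(damaged, template))
```

In this toy case only the daughter that inherits the deaminated strand carries the C→T transition, which matches the "two daughter species" outcome described in the text.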

An additional possible scenario suggests that transcrip- 
tional R loops induce genomic instability by interfering 
with DNA replication (Aguilera 2002; Gan et al. 2011; 
Houlard et al. 2011; Aguilera and Garcia-Muse 2012). 
Thus, replication fork collisions with blocked Pol II have 
been shown to induce TAR or DNA breaks in budding 
yeast and mammals (Prado and Aguilera 2005; Gottipati 
et al. 2008; Boubakri et al. 2010). Unrepaired DNA lesions 
formed on the ssDNA of the R loop or the RNA/DNA 
hybrids themselves forming behind elongating Pol II 
may somehow restrict transcription, which in turn may 
block replication forks. Such blocked replication forks 
could then generate DNA lesions and DSBs in the newly 
synthesized DNA. These would induce recombination- 
mediated repair, which in turn could lead to chromo- 
some rearrangements and genomic instability (Aguilera 
and Garcia-Muse 2012). R loops are prevalent in meiosis, 
at least in S. cerevisiae and Caenorhabditis elegans, and 
their accumulation leads to replication impairment 
and genomic instability (Castellano-Pozo et al. 2012). 
In mammalian cells, replication-transcription collisions 
in long human genes (>800 kb) have been shown to be 
associated with R-loop accumulation, which in turn
causes instability at so-called common fragile sites (CFSs) 
(Helmrich et al. 2011). Instability at CFSs may also relate 
to slower or incomplete replication in areas of chromatin 
compaction (Debatisse et al. 2012). 

Additional factors leading from a transient R-loop struc- 
ture to deleterious genomic instability have only recently 
been identified. R-loop accumulation in S. cerevisiae, 
C. elegans, and human cells has been linked to histone 3 
Ser10 phosphorylation (H3S10P), a mark of chromatin
compaction (Castellano-Pozo et al. 2013). It is proposed 
that R loops trigger formation of the H3S10P mark, which 
in turn could cause replication fork stalling, transcription- 
replication collisions, and, ultimately, DSBs in the newly 
synthesized DNA (Castellano-Pozo et al. 2013). 

Surveillance mechanisms: What protects us from R 
loops? 

The deleterious effects of R loops formed during transcrip- 
tion on genome integrity have been variously documented 
(Aguilera 2002). Given that formation of R loops is an 
evolutionarily conserved mechanism (Li and Manley 2005; 
Aguilera and Garcia-Muse 2012), it is perhaps not surpris- 
ing that different organisms have used diverse mechanisms 
to protect their genomes (Li and Manley 2006; Aguilera 
and Garcia-Muse 2012). So far, five different mechanisms 
are thought to regulate R-loop formation (see Fig. 2): 
(1) RNase H enzymes, which specifically degrade the
RNA in RNA/DNA hybrids (for review, see Cerritelli and
Crouch 2009); (2) RNA/DNA helicases such as the yeast
Sen1 or homologous human Senataxin (Mischo et al. 2011;
Skourti-Stathaki et al. 2011) and the human DHX9 heli- 
case, which also acts on G4 structures (Chakraborty and 
Grosse 2011); (3) topoisomerases, which relax the negative
DNA supercoiling that otherwise causes persistent R-loop
formation (Drolet et al. 1994, 1995; Tuduri et al.
2009; El Hage et al. 2010; Yang et al. 2014); (4) proteins 
that prevent R-loop formation, such as mRNA biogenesis 
(Huertas and Aguilera 2003; Dominguez-Sanchez et al. 
2011; Castellano-Pozo et al. 2012) and processing proteins
(Li and Manley 2005; Paulsen et al. 2009; Wahba et al. 
2011; Stirling et al. 2012; Santos-Pereira et al. 2013); and 
(5) suppressors of proteins that promote R-loop formation 
(i.e., Rad51 and AtNDX) (Sun et al. 2013; Wahba et al. 
2013). 

Factors that prevent R-loop formation 

In S. cerevisiae, transcription-induced R loops were first 
documented with the characterization of the THO com- 
plex, suggested to be involved in transcriptional elongation 
(Aguilera 2002). This complex consists of four nuclear 
proteins (Hpr1, Tho2, Mft1, and Thp2) and is associated
with the transcription-export complex (TREX) containing 
Tex1 and the mRNA export factors Sub2 and Yra1 (Chavez
et al. 2000; Strasser et al. 2002). The physical interaction of 
the THO complex with TREX directly links mRNA pack- 
aging with RNA export. Mutations affecting THO/TREX 
have been shown to affect transcriptional elongation, proper
mRNA export, and recombination (Strasser et al. 2002;
Rondon et al. 2003). A distinctive phenotype of yeast THO
mutants (first identified in the HPR1 and THO2 genes) is
their transcription-associated hyperrecombination phenotype
(Aguilera and Klein 1990; Piruat and Aguilera 1998),
and this is directly associated with R-loop formation
(Huertas and Aguilera 2003).

Figure 2. Diverse protection mechanisms against R-loop formation.
Two types of surveillance factors have been identified: factors
that prevent formation of R loops and factors that actively remove
them. DNA topoisomerase enzymes suppress R-loop formation by
relaxing the negative supercoiling behind elongating Pol II. The
THO complex (blue circle) facilitates efficient packaging of
nascent RNA into messenger ribonucleoprotein particles (mRNPs),
preventing R-loop formation. Splicing and 3' end processing factors
associate with nascent RNA and prevent R loops. RNA/DNA
helicases and RNase H enzymes remove R loops once formed.
DNA is shown as gray and black lines, and RNA is shown as a red
line. Dotted lines indicate the site of action of different factors.
The diagram is not drawn to scale.

The THO/TREX complex is responsible for packaging 
of pre-mRNA with RNA-binding proteins. Recently, another
factor involved in mRNA export and processing, the
yeast Npl3, was also shown to prevent R-loop-induced
transcription-replication collisions and genome instability
(Santos-Pereira et al. 2013), suggesting a functional link between RNA
metabolism and R-loop-associated genomic instability. As
the nascent transcript emerges from elongating Pol II, it
may have two immediate fates. It either is cotranscriptionally
packaged into messenger ribonucleoprotein particles
(mRNPs) and exported through the nuclear pore or may 
invade the DNA duplex behind the elongating Pol II to 
form R-loop structures. R loops could then interfere with 
DNA replication, induce ssDNA breaks, or become recom- 
bination intermediates. Consequently, it could be argued 
that efficient mRNA packaging into mRNPs prevents 
R-loop formation and in turn restricts TAR and DNA 
damage (Huertas and Aguilera 2003; Moore and Proudfoot 
2009; Mischo et al. 2011; Aguilera and Garcia-Muse 2012; 
Santos-Pereira et al. 2013). However, it has been reported 
that even with normal mRNP biogenesis, accumulation 
of R loops can still occur (Mischo et al. 2011), suggesting 
that R loops form at a much higher frequency than once 
thought. 

DNA topoisomerase I (TOP1) is an evolutionarily conserved factor that
suppresses R-loop formation (Drolet et al. 1995; Tuduri
et al. 2009; El Hage et al. 2010). This is possibly due to
the ability of TOP1 to relax negative DNA supercoiling.
In the absence of TOP1, negative supercoils accumulate
behind elongating Pol II to promote opening of DNA, 
which in turn facilitates annealing between the nascent 
RNA and DNA template strand with subsequent R-loop 
formation in bacteria (Drolet et al. 1995), S. cerevisiae 
(El Hage et al. 2010), and human cells (Tuduri et al. 2009).
Very recently, TOP3B, another member of this subfamily, 
was found to reduce negative supercoiling and R-loop 
formation (Yang et al. 2014). The interesting point here 
is that TOP3B is recruited to human and mouse gene 
loci by recognizing arginine methylation histone marks 
through its interaction with the methyl-arginine effector 
tudor domain-containing protein 3 (TDRD3). According 
to the proposed model, the TDRD3-TOP3B complex is 
recruited to regions of active transcription; TDRD3 recognizes
the methyl-arginine histone marks and the methylated
C-terminal domain (CTD) of Pol II, whereas TOP3B
resolves negative supercoiling that forms behind elongat- 
ing Pol II and by doing so restricts R-loop formation (Yang 
et al. 2014). Interestingly, in this study, formation of R 
loops is controlled by the presence of a chromatin mark 
as opposed to the alternative scenario suggested by the 
Aguilera laboratory (Castellano-Pozo et al. 2013), where 
accumulated R loops cause formation of H3S10P. In the 
future, it will be intriguing to test under which conditions 
R loops regulate chromatin structure (and consequently 
genome dynamics). Indeed, are they a cause or consequence 
of the epigenetic microenvironment? 

Genes of higher vertebrates are significantly longer, and
the presence of introns is almost ubiquitous; therefore,
these organisms have evolved additional mechanisms to protect
their genomes. SRSF1, a serine-arginine-rich (SR) protein
that regulates the first steps of splicing, appears to
interconnect pre-mRNA processing and genomic instabil- 
ity (Li and Manley 2005). Li and Manley (2005) demon- 
strated in chicken DT40 cells and human HeLa cells that 
depletion of SRSF1 causes the nascent transcript to form 
R loops, which in turn promote DNA rearrangements 
mediated by DSBs. In essence, the model proposed is that 
SRSF1 is cotranscriptionally loaded onto the nascent pre-
mRNA via the phosphorylated CTD of Pol II to not only 
promote splicing but also prevent R-loop formation and 
subsequent genomic instability (Aguilera 2005; Li and 
Manley 2005). This is a clear example of the connections 
between transcription-induced R loops, pre-mRNA pro- 
cessing, the Pol II CTD, and transcription-associated geno- 
mic instability. 

The cotranscriptional R-loop-induced genomic instabil- 
ity observed in DT40 cells with the loss of SRSF1 factor 
(Li and Manley 2005, 2006) resembles the phenotype of 
yeast THO mutants (Huertas and Aguilera 2003). Until 
recently, the involvement of the human THO/TREX 
complex in genomic instability was not clearly identified. 
A study from the Aguilera laboratory (Dominguez-Sanchez 
et al. 2011), however, revealed that the interplay between 
mRNP biogenesis and genomic instability is indeed con- 
served from yeast to humans. In essence, depletion of the 
human THO complex results in accumulated DNA breaks 
and consequent genomic instability, which is dependent on 
R-loop formation (Dominguez-Sanchez et al. 2011). In 
addition to this, genome-wide data suggest that RNA 
processing factors help prevent genomic instability (Paulsen 
et al. 2009), further pointing toward a powerful interplay 
between pre-mRNA processing and R-loop-dependent ge- 
nomic instability. 



Factors that remove R loops 

As mentioned above, apart from the active prevention 
of R loops, cells also use a range of dedicated factors to 
actively remove them once formed. First of all, the RNase 
H enzymes act to cleave the RNA of RNA/DNA hybrids 
(Stein and Hausen 1969; Hausen and Stein 1970; Cerritelli
and Crouch 2009). In most organisms, there are two types 
of RNase H. Eukaryotic RNase H1 consists of a single
polypeptide, with the N-terminal domain being responsible
for binding to the RNA/DNA hybrid (hybrid-binding
domain or HBD), and the C-terminal domain containing the RNase H
active site (Cerritelli and Crouch 2009). RNase H2 is
composed of three different polypeptides, with
RNase H2A being the catalytic subunit. RNase H1 and RNase H2
endonucleolytically cleave the RNA within the RNA/ 
DNA hybrid in a sequence-independent manner. How- 
ever, they may have different in vivo substrates due to 
their differences in hybrid hydrolysis. 

RNase H1 is present in the nucleus and mitochondria
and is essential for mitochondrial replication (Cerritelli
et al. 2003). Overexpression of nuclear RNase H1 (Cerritelli
et al. 2003) has been widely used to experimentally 
remove R loops and so far is perhaps the only well- 
studied and efficient way to diminish the cellular 
levels of R loops. RNase H2 is not as well defined due 
to its multisubunit composition and low abundance. It 
is believed to be mostly a repair enzyme due to its 
substrate specificity. Unlike RNase H1, RNase H2 can
recognize and cleave a single ribonucleotide inserted in 
a DNA duplex, and therefore it was suggested that RNase 
H2 removes ribonucleotides misincorporated into DNA 
(Eder et al. 1993; for review, see Cerritelli and Crouch 
2009). RNase H2 is also responsible for removing the 
Okazaki fragment RNA primers from the newly synthe- 
sized lagging strand during DNA replication (Murante 
et al. 1998; for review, see Cerritelli and Crouch 2009). 
Recently, another very interesting function of RNase H2 
was shown: It can uniquely process R loops that are 
generated during DNA replication/repair (Chon et al. 
2013). It was also suggested in this same study that RNase
H1 and RNase H2 have some overlapping specificities in
R-loop resolution; however, RNase H1 is mainly responsible
for the resolution of transcription-associated R loops
(Chon et al. 2013). 

Second, the yeast Sen1, a superfamily I RNA/DNA helicase
(Kim et al. 1999), acts to remove R loops and prevent
genomic instability caused by R-loop-mediated DNA damage
(Mischo et al. 2011). R loops accumulate in a sen1-1
strain that carries a mutation in the helicase domain of
Sen1. Furthermore, SEN1 genetically interacts with genes
involved in homologous recombination (HR). Sen1 is also
a termination factor for coding and noncoding genes
(Ursic et al. 1997; Steinmetz et al. 2006; Kawauchi et al.
2008). Sen1, Nrd1, and Nab3 proteins comprise the NRD
complex, the major factor in promoting sno/snRNA termination
(Ursic et al. 1997; Steinmetz et al. 2001). Sen1
binds to the CTD phosphorylated on Ser2 (Ursic et al.
2004; Chinchilla et al. 2012), which possibly facilitates its
recruitment to multiple coding as well as noncoding genes,
where it tends to accumulate toward the 3' end (Chinchilla
et al. 2012). Sen1 may also play a role in coordinating
transcription and replication, since it is associated with
replication forks across Pol II active genes (Alzu et al.
2012).

Senataxin is the human homolog of Sen1 and has also
been implicated in transcriptional termination (Suraweera 
et al. 2009; Skourti-Stathaki et al. 2011; Padmanabhan 
et al. 2012). Senataxin was initially identified when mu- 
tations causing ataxia oculomotor apraxia 2 (AOA2) and 
amyotrophic lateral sclerosis type 4 (ALS4) were mapped 
to the SETX gene (James and Talbot 2006; Palau and 
Espinos 2006). AOA2 mutations include both missense 
and nonsense mutations leading to senataxin loss of func- 
tion, whereas mutations linked to ALS4 appear to be 
missense, dominant mutations resulting in gain of function
(Chen et al. 2004; Arning et al. 2013). These diseases
are associated with the progressive degeneration of motor 
neurons in the brain and spinal cord, progressive muscle 
weakness, and atrophy. SETX encodes a 302.8-kD widely 
expressed protein containing an N-terminal putative 
protein-protein interaction domain and a C-terminal 
DEAD-box helicase domain followed by a nuclear local- 
ization signal (NLS) (Chen et al. 2004). Most senataxin 
mutations found in AOA2/ALS4 families either cause 
premature translational termination or interfere with the 
function of the helicase or N-terminal protein interaction 
domains (Chen et al. 2004; Moreira et al. 2004; Duquette 
et al. 2005; Criscuolo et al. 2006; Fogel and Perlman 2006). 
However, the precise mechanism of toxicity caused by 
these mutations and manifested in AOA2/ALS4 patients 
remains to be elucidated. Similar to its yeast counterpart, 
senataxin interacts with Pol II and other RNA processing 
factors, such as poly(A)-binding proteins 1 and 2 (PABP1/2), 
hnRNPs, SAP155, and SMN, pointing toward a role for 
senataxin in pre-mRNA processing as well as transcrip- 
tional termination (Suraweera et al. 2009). 

Increasing evidence identifies senataxin as a DNA repair
enzyme (Becherel et al. 2013; Yüce and West 2013) in
addition to its role in the resolution of R loops arising at
G-rich termination pause sites (Skourti-Stathaki et al.
2011). Yüce and West (2013) revealed that senataxin forms
increased nuclear foci in S/G2 phase in response to DNA
damage and impaired DNA replication. Importantly,
these foci decreased significantly after R-loop resolution
or transcriptional inhibition (Yüce and West 2013). The
role of senataxin in DNA damage response, particularly 
during mouse male meiosis (spermatogenesis), has been 
highlighted by analysis of mouse strains with SETX 
gene knockouts. This study led to a model in which 
senataxin is proposed to resolve R loops to ensure genome 
integrity during meiotic recombination (Becherel et al.
2013). THO mutants from C. elegans and S. cerevisiae
also show defective meiosis and increased DNA damage 
(Castellano-Pozo et al. 2012), pointing toward an evolu- 
tionarily conserved mechanism to maintain genome sta- 
bility during meiosis by preventing formation of these 
potentially harmful structures. 

It is of note that defects in DNA repair enzymes 
are strongly linked with neurodegenerative disorders
(McKinnon 2009). In the future, it will be of vital importance
to understand why defects in senataxin, a ubiquitously
expressed protein, particularly affect neuronal cells.
AOA2 disorder manifests primarily in post-replication 
neurons where DNA repair could rely mostly on non- 
homologous end-joining (NHEJ) rather than HR. This 
could possibly increase genomic instability in these neu- 
rons. A recent study suggests a novel role for the exosome 
in senataxin-mediated DNA damage response (Richard et al. 
2013). In essence, the exosome interacts with sumoylated 
senataxin, which is in turn targeted to DNA damage regions. 
Importantly, both sumoylation and the interaction are
disrupted in AOA2, but not ALS4, disorder. In agreement, 
it has also been shown that deletion of TRF4 in S. cerevisiae, 
a component of the TRAMP complex that activates the 
exosome, leads to accumulation of R loops and genomic 
instability (Gavalda et al. 2013). Significantly, the exosome 
also associates with AID in the R-loop-enriched switch 
regions of B cells (Basu et al. 2011). The physiological role 
of the exosome-senataxin interaction has yet to be estab- 
lished. Future research is necessary to understand why and 
how this mechanism is disrupted in AOA2 disorder. 

Rad51 and trans-induced R loops 

So far, we have described what is known about the factors
that prevent or resolve R loops. However, do some factors 
actively promote R-loop formation? One particular factor 
has been shown to promote in vivo R-loop formation in 
S. cerevisiae: the Rad51 protein (see Fig. 3; Wahba et al. 
2013). The eukaryotic Rad51 protein, which is homologous to
bacterial RecA, plays a major role in HR during DNA
repair of DSBs. Its loading on "damaged" DNA is thought
to be stimulated by the ssDNA-binding protein RPA. Rad51 then
promotes strand exchange (invasion of ssDNA into du- 
plex DNA) by forming nucleoprotein filaments (Benson 
et al. 1994; Baumann et al. 1996). The bacterial RecA was 
also shown to promote RNA/DNA hybrid formation in 
vitro (Kasahara et al. 2000; Zaitsev and Kowalczykowski 
2000). The Koshland laboratory (Wahba et al. 2013) demonstrated
that in S. cerevisiae, deletion of Rad51 results
in reduced formation of R loops and subsequent genomic 
instability, especially when R loops are enhanced by 
inactivation of RNA processing activities. Furthermore, 
Rad51 colocalizes with R loops prior to formation of any
DSBs. Paradoxically, Rad51 is not only a repair factor but 
also promotes R-loop-mediated DNA damage and geno- 
mic instability. This is particularly important in cancer 
cells where Rad51 could directly promote tumor bio- 
genesis. Two regulators of Rad51 have been described: 
Rad52, which is required for binding of Rad51 to ssDNA, 
and Srs2, which acts as an antagonist and consequently 
prevents R-loop formation (Wahba et al. 2013). Therefore,
Srs2 can be added to the list of proteins that prevent 
R-loop formation and subsequent induced genomic 
instability. 

Interestingly, in the absence of Rad51, transcription 
itself fails to lead to accumulation of R loops and genomic 
instability (Wahba et al. 2013). A yeast strain was generated
with two copies of a particular human DNA
sequence: one in a yeast artificial chromosome (YAC) and
the other in one of yeast's own chromosomes but under
induced transcription regulation. When transcription was
activated, the induced transcript invaded the homologous
DNA in the YAC, resulting in the formation of an R loop
in trans (away from the initial transcription point). The
formation of this trans R loop was promoted by Rad51
(Wahba et al. 2013). This finding challenges for the first
time the dogma in the field that R loops form only
cotranscriptionally (in cis). It also raises important questions
about R-loop-induced genomic instability. Trans-induced
R loops could be a bigger threat to genome integrity than
those formed in cis. Upon transcription of highly repetitive
elements, in cis R loops would only occur at that region,
whereas in trans-induced R loops could occur in many
places across the genome, creating multiple "hot spots" for
genomic instability.

Figure 3. Rad51 can promote cis and trans R loops. The HR
factor Rad51 can promote strand exchange, ultimately leading to
cotranscriptional R-loop formation (cis R-loop). Trans R loops can
also be mediated by Rad51. As shown in the diagram, trans RNA
may target the ssDNA as part of a pre-existing R loop. Alternatively,
trans RNA could target dsDNA if local unwinding of the
DNA duplex occurs by mechanisms such as DNA replication.
Such trans R loops are associated with the popular CRISPR-Cas9
system in which CRISPR guide RNA hybridizes with target DNA
loci generating targeted DNA breaks. Trans R loops can also occur
between ncRNAs and homologous DNA, ultimately leading to
transcriptional gene silencing. DNA is shown as gray and black
lines, and RNA is shown as a red line. The diagram is not drawn to
scale.

Trans-induced R loops have also recently been shown 
to enable a broad range of applications, including fast 
generation of genetically modified cells and animals 
(Gratz et al. 2013; Hwang et al. 2013; Wang et al. 2013; 
Yang et al. 2013) and genetic screening at a genomic level 
(Shalem et al. 2014; Wang et al. 2014). The so-called 
CRISPR (clustered regularly interspaced short palindromic 
repeat)-Cas9 system is a naturally occurring microbial 
immune system for protection against phage and other 
genetic elements (for review, see Terns and Terns 2011).
Short DNA fragments from infecting phage genomes are 
incorporated into the host genome within the CRISPR 
locus. Transcription of this CRISPR locus then gives rise to 
CRISPR RNA, which in turn is processed into short RNAs 
consisting of phage sequence and repeat elements from the 
CRISPR locus. These small RNAs complexed with Cas9 
act as guides to target the homologous DNA locus, cre- 
ating a trans R loop. Cas9 then cuts the target locus on 
each strand and ultimately silences the target DNA (Jinek 
et al. 2012). Very recently, the crystal structure of Cas9 in
complex with guide RNA and target DNA was reported,
revealing the key functional interactions, including a bona 
fide RNA/DNA hybrid (Nishimasu et al. 2014). By analogy 
to CRISPR-Cas9, it is tempting to speculate that trans R 
loops could explain how noncoding RNAs (ncRNAs) 
generally target homologous DNA, which is an underly- 
ing principle of transcriptional gene silencing by the RNAi 
machinery. 

Even though the precise targeting of CRISPR guide RNA 
to target loci (Jinek et al. 2012) has made this system very 
popular for genome editing, little is known about the actual 
molecular mechanism. Given that the target DNA is in 
a duplex, how does guide RNA hybridize with DNA? 
Does Cas9 itself or another factor first denature the DNA 
to allow RNA/DNA hybridization? An alternative, in- 
teresting scenario could be that cis R loops must form 
prior to CRISPR targeting and so enable hybridization of 
the now single-stranded target DNA with the CRISPR 
guide RNA. In this case, targeting would be dependent on 
the formation of cis R loops and consequently on active 
transcription. 

So far, we have discussed R loops as precursors of chromosomal
rearrangements in yeast and mammalian cells
with deleterious consequences to cell integrity. But is the 
formation of R loops always unprogrammed and poten- 
tially harmful? 

The new era of R loops: from threats of genomic 
instability to powerful regulators of gene expression 

Over the last decade and particularly the last 3 years, a 
new era has emerged for the R-loop field, identifying these 
structures as powerful regulators of gene expression. So far, 
we have documented the unprogrammed formation of R loops
as a rare outcome of the transcriptional process with
potentially harmful consequences: in effect, an enemy to
cellular fitness (Aguilera and Garcia-Muse 2012). How- 
ever, the "other side of the coin" is more positive. An 
ever-increasing body of evidence has shed light on a num- 
ber of biological processes controlled by the programmed 
formation of R loops. 

The first beneficial function of R loops to be uncovered 
was CSR at the Ig heavy chain locus in activated B cells 
(Yu et al. 2003), as mentioned above. This study is of 
particular importance, as it first demonstrated the in
vivo formation of R loops over switch regions that 
undergo CSR and revealed that these R-loop structures 
facilitate CSR only under specific conditions. R loops 
forming at the CSR locus are quite long (>1 kb) and very 
stable, as opposed to R loops at replication origins. The 
length of the observed stable R loop led to the view that 
the DNA sequence itself might play a vital role in R-loop 
formation and stabilization. Indeed, these switch regions 
are highly repetitive, GC-rich regions. Formation of 
transcription-dependent R loops over these regions leaves 
the G-rich nontemplate DNA strand displaced. Based on 
an analysis using bisulfite treatment to target ssDNA, it 
was also shown that stable R loops occur only in the 
physiological orientation (with G-rich transcripts) (Yu 
et al. 2003). 

Controlling gene expression requires the definition of
gene boundaries: the 5' end promoter and 3' end termina- 
tor. R loops have been recently shown to form at both 
gene ends (see Fig. 4; Skourti-Stathaki et al. 2011; Ginno 
et al. 2012, 2013). However, do they control gene expres- 
sion in these cases? The Chedin laboratory (Ginno et al. 
2012, 2013) has recently presented evidence for wide- 
spread R-loop formation over the 5' regions downstream 
from CpG promoters in the human genome. These genomic 
regions are GC-rich and have a strong positive GC skew 
(template strand having an excess of C vs. G residues). Even 
though a direct association of R loops in the maintenance of 
the unmethylated state of CpG promoters has not been 
established, this study suggested that R loops forming at 
promoter regions could potentially lead to the activation 
of genes by recruiting either the protective histone 3 Lys4 
trimethylation (H3K4me3) mark or the DNA demeth- 
ylation complex (Ginno et al. 2012). Consistent with this 
model, the AID complex is also found at H3K4me3- 
enriched promoter-proximal chromatin, including CpG 
islands (Yamane et al. 2011). The key factor here could be
the displaced ssDNA in the R loops, which may directly 
recruit histone methyltransferases or DNA demeth- 
ylases. Paradoxically, the ssDNA is also suggested to link 
R loops and DNA damage events. How cells sense these 
two opposite situations and act accordingly remains an 
important but still unanswered question. 
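
To make the notion of GC skew concrete, the following is a minimal Python sketch that scores skew in windows along the nontemplate strand, assuming the commonly used definition (G - C)/(G + C). The function name, window size, step, and toy sequence are illustrative assumptions and are not taken from the studies cited above.

```python
def gc_skew(seq, window=100, step=50):
    """Return (start, skew) pairs, with skew = (G - C) / (G + C) per window.

    `seq` is read as the nontemplate strand, so a positive value means
    G excess on that strand and C excess on the template strand.
    """
    seq = seq.upper()
    results = []
    # Score at least one window, even if the sequence is shorter than `window`.
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows with no G or C are reported as zero skew (an arbitrary choice).
        skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, skew))
    return results


# Toy example (hypothetical sequence): a G-rich stretch followed by a C-rich one.
toy = "GGGAGGGTGG" * 10 + "CCCTCCCACC" * 10
windows = gc_skew(toy, window=100, step=100)
```

In this toy case the first window is entirely G-skewed and the second entirely C-skewed; real promoter sequences would give intermediate values, and any threshold for calling a "strong" skew would have to be chosen by the analyst.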

R loops are directly implicated in transcriptional termi- 
nation of some human genes by studies from our labora- 
tory (Skourti-Stathaki et al. 2011). R loops formed over 
G-rich termination regions facilitate Pol II pausing down- 
stream from the poly(A) signal prior to transcriptional 
termination. However, in this situation, a very fine bal- 
ance of R loops is required for efficient termination. Once 
formed, these R loops must then be resolved by the 
helicase senataxin to release the nascent RNA and so 
allow its Xrn2-mediated degradation, which ultimately 
leads to efficient Pol II transcriptional termination 
(Skourti-Stathaki et al. 2011). This is a situation in which,
even in the same gene, R loops can have a dual role: They 
are required for efficient termination, but their accumulation
(following senataxin knockdown) inhibits this
process. In the future, it will be interesting to investigate 
how, within one gene, cells can prevent the deleterious 
effects of R loops (resolved by senataxin) but at the same 
time allow their positive function (in this case, efficient 
termination). 

Another interesting observation further supports the 
connection of R loops with Pol II termination. Genome- 
wide analysis has previously revealed that G-rich se- 
quences immediately downstream from the poly(A) signal 
are relatively common in mammalian genes (Salisbury 
et al. 2006). Recent genome-wide bioinformatic analysis
has also shown that promoters and 3' regions of genes 
are enriched in G4-forming sequences (Huppert et al. 
2008). Interestingly, 3' untranslated region (UTR) G4s 
are particularly prevalent in cases in which a second gene 
is placed in close proximity, suggesting that G4s may 
be involved in transcriptional termination (Huppert 
et al. 2008). However, it still remains to be established 

Figure 4. R loops are enriched at both gene ends. In human
protein-coding genes, R loops form over unmethylated CpG 
island promoters with positive GC skew and G-rich termination 
regions. Promoter-enriched R loops could activate gene expres- 
sion, whereas terminator-enriched R loops promote transcrip- 
tional termination by facilitating Pol II pausing downstream 
from the poly(A) signal. Transcription start site (TSS), transcription
termination site (TTS), and poly(A) (pA) signal are shown.
Colored shading indicates peaks of R loops over 5' and 3' gene 
ends. The diagram is not drawn to scale. 



whether genes that form R loops at their G-rich 3' ends 
or at CpG island promoters also form G4 structures. 
Finally, a second R-loop genomic analysis strikingly 
suggested that a subset of Pol II terminators with a posi- 
tive GC skew corresponds to R-loop regions genome-wide 
(Ginno et al. 2013). As in G4s (Huppert et al. 2008), genes 
with R loops at their 3' ends are located in gene-dense 
regions, further reinforcing the role of R loops in efficient 
termination. 
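
As an illustration of how candidate G4-forming sequences can be located computationally, the sketch below scans one strand for a widely used heuristic motif: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The pattern, function names, and toy sequence are assumptions made for illustration; the exact criteria used in the analyses cited above may differ.

```python
import re

# Assumed heuristic motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+
G4_PATTERN = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_g4_candidates(seq):
    """Return (start, end, motif) for non-overlapping candidate G4 motifs."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


def reverse_complement(seq):
    """Reverse-complement a DNA sequence so the opposite strand can be scanned."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Toy example (hypothetical sequence) with one candidate motif.
toy = "ATATGGGAGGGTGGGAAGGGTTTT"
hits = find_g4_candidates(toy)
```

Scanning `reverse_complement(seq)` as well would report motifs on the opposite strand, which matters here because only a G-rich nontemplate strand is expected to favor R-loop formation. A sequence match indicates potential only; it does not show that a G4 or an R loop forms in vivo.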

Altogether, we suggest that R loops are a common 
feature of G-rich pause terminator elements in human 
genes (Skourti-Stathaki et al. 2011; Ginno et al. 2013). 
How do they promote termination? Stopping Pol II is not 
an easy task. Once in a processive elongation mode, Pol II 
elongates at 4.3 kb/min (70 bp/sec) (Darzacq et al. 2007) 
over a diverse sequence landscape that may extend to 
>1 Mb in vertebrates. Also, given the fact that Pol II is 
a very large protein, one could argue that transient
structures such as R loops are inadequate to anchor Pol II.
Identifying the molecular mechanism by which R loops 
promote termination is likely to provide new insights 
into the regulation of gene expression at the level of 
transcription. 
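
The scale of the problem can be checked with simple arithmetic. The short Python sketch below converts the quoted elongation rate into bp/sec and estimates how long Pol II would need to traverse a 1-Mb transcription unit, assuming a constant rate with no pausing (a deliberate simplification). The variable names are illustrative.

```python
RATE_KB_PER_MIN = 4.3     # elongation rate quoted in the text (Darzacq et al. 2007)
GENE_LENGTH_KB = 1000.0   # 1 Mb, the upper range mentioned in the text

# Convert kb/min to bp/sec.
rate_bp_per_sec = RATE_KB_PER_MIN * 1000 / 60

# Time to traverse the whole unit at a constant rate (assumption: no pausing).
minutes_per_gene = GENE_LENGTH_KB / RATE_KB_PER_MIN
hours_per_gene = minutes_per_gene / 60

print(f"{rate_bp_per_sec:.1f} bp/sec")
print(f"{minutes_per_gene:.0f} min (~{hours_per_gene:.1f} h) to transcribe 1 Mb")
```

The conversion gives roughly 72 bp/sec, in line with the rounded figure of 70 bp/sec quoted above, and close to 4 h of uninterrupted elongation for a 1-Mb gene.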

R loops and ncRNAs 

R loops have also been shown recently to play a role in the 
regulation of ncRNA (see Fig. 5; Powell et al. 2013; Sun 
et al. 2013). In Arabidopsis thaliana, COOLAIR is the 
antisense long ncRNA (lncRNA) that regulates the expression
of the FLC gene, a key repressor of flower development.
Upon prolonged cold conditions, COOLAIR
becomes transcriptionally active and represses FLC tran- 
scription. Until recently, the transcriptional regulation of 
COOLAIR itself remained uncharacterized. However, 
compelling new evidence on the role of R loops in this 
process recently came to light (Sun et al. 2013). In essence, 
R loops are shown to form over the promoter region of 
COOLAIR, and a ssDNA-binding homeodomain protein, 
AtNDX, binds and stabilizes these R loops. This ultimately 
leads to COOLAIR transcriptional repression (Sun et al. 
2013). This study provides a clear example of how regulatory

Figure 5. R loops transcriptionally regulate ncRNAs. (A) In
plants, COOLAIR antisense lncRNA controls the expression of
the FLC gene. R loops form over the promoter region of
COOLAIR and are stabilized by the ssDNA-binding protein
AtNDX. This causes transcriptional repression of COOLAIR
and, ultimately, activation of the FLC gene. (B) In human
neuronal cells, topoisomerase inhibitor topotecan causes accumulation
of R loops in the G-rich termination region of the
Snord116 gene. This causes chromatin decondensation and
blocks read-through transcription that otherwise forms the
Ube3a antisense transcript. This activates the expression of
the Ube3a sense transcript. Arrows indicate the direction of
transcription. For simplicity, nucleosomes are omitted. The
diagram is not drawn to scale.

lncRNAs can themselves be regulated but also raises
intriguing questions. How does AtNDX maintain R-loop 
formation in the presence of surveillance mechanisms? 
Does it act faster than helicases and RNase H enzymes, 
or do these cells somehow protect R loops and ensure 
COOLAIR repression and proper flowering patterns? 
A comparison with Rad51 may be relevant, as this 
protein also binds ssDNA and promotes R-loop forma- 
tion in vivo in S. cerevisiae (Wahba et al. 2013). In this 
case, however, Rad51 promotes "unprogrammed" R-loop 
formation and potential genomic instability. Given the 
fact that both AtNDX and Rad51 recognize unpaired
ssDNA derived from an R loop, how does Rad51 on the 
one hand restrict potentially "deleterious" R loops, while 
AtNDX promotes "regulatory" R loops? Regardless of 
the exact mechanism, it is clear that discovering how 
AtNDX and Rad51 regulation occurs will provide a powerful
new tool to understand how cells distinguish
between the two "types" of R loops. 

R loops have recently been linked with the molecular 
mechanism of a cancer drug, topotecan, that reactivates 
the expression of the imprinted silenced gene Ube3a 
(Powell et al. 2013). Angelman syndrome (AS) is an 
autism-related disorder that is caused by mutations or 
deletions of the maternal copy of the Ube3a gene (Kishino 
et al. 1997; Matsuura et al. 1997). Normally, neurons 
express only the maternal copy of this gene and silence 
the paternal copy via the Ube3a antisense transcript. So, 
Ube3a mutations in the maternal copy result in a complete
loss of the protein, a brain-specific ubiquitin E3
ligase. Ube3a antisense is located immediately downstream
from the Snord116 gene, mutations of which
cause a second disorder, Prader-Willi syndrome. The 
cancer drug topotecan was found to reactivate the 
paternal copy of Ube3a by reducing the antisense Ube3a 
transcript in neurons and therefore could be potentially 
used to treat AS (Huang et al. 2011). Even though 
topotecan holds promise for AS treatment, it still re- 
mains unknown how it targets specifically Ube3a and 
no other genes within this locus. Importantly, topotecan 
is an inhibitor of topoisomerase, which, as mentioned 
above, relaxes negative supercoiling. It is now revealed 
that R-loop formation plays a role in the topotecan effect 
(Powell et al. 2013). In essence, R loops form over the 
G-rich Snord116 gene, which in turn causes nucleosome
depletion and chromatin decondensation in the paternal 
allele. 

Under physiological conditions, Ube3a antisense tran- 
scription silences Ube3a in cis. Upon topotecan treat- 
ment, these R loops are stabilized and so accumulate. 
According to this model, this R-loop accumulation causes 
excessive chromatin decondensation, stalling of the tran- 
scriptional machinery, and inhibition of Ube3a antisense 
expression. This in turn activates paternal Ube3a expres- 
sion (Powell et al. 2013). This suggests that topotecan can 
also be used as a powerful regulator of R-loop formation in 
other contexts and so provides a new tool in the charac- 
terization of R-loop biology. 

Two important points arise from this study: First, in 
this case, R loops are shown to induce nucleosome 
depletion and chromatin decondensation, although it is 
not clear how this occurs. It could also be argued that 
chromatin decondensation/nucleosome depletion facili- 
tates R-loop formation. Cause cannot easily be distin- 
guished from the effect in this case as in other R-loop- 
associated processes. As mentioned above, the accumu- 
lation of R loops can also induce chromatin condensation 
(Castellano-Pozo et al. 2013) in contrast to the Snord116
gene. Even though, in these two studies, R-loop forma- 
tion may have opposite outcomes on chromatin struc- 
ture, it is tempting to speculate that a more general link 
exists between R loops and chromatin dynamics. Second, 
topotecan is the first R-loop targeting drug used as 
a therapeutic agent in genetic disorders. Camptothecin 
(CPT), another topoisomerase inhibitor that is used as an 
anti-cancer drug, has also been shown to stabilize R loops. 
Interestingly, CPT causes accumulation of Pol II anti- 
sense transcripts over CpG island promoters (Marinello 
et al. 2013). It is evident that, more than ever, research on 
R loops is vital to understand how these structures could 
be disrupted in cancer and other diseases. This would also 
strengthen the likelihood that R-loop formation is tightly 
controlled, as its dysregulation and/or accumulation can 
compromise genome dynamics and function. 

Finally, R loops formed over the centromeric repeats in 
S. pombe have been shown to mediate RNAi-dependent 
heterochromatin formation (Nakama et al. 2012). This 
study is of particular interest, since it shows that R loops 
are potentially involved in silencing centromeric DNA. 
Heterochromatic ncRNAs have been suggested to remain
on chromatin and function as a binding platform for the 
RNAi apparatus (Cam et al. 2009). Following this study, 
two important questions remain unanswered. How do 
ncRNAs remain bound to chromatin? Do RNAi factors 
(especially the RITS complex) target the gene transcript 
or the DNA strand? Nakama et al. (2012) suggested that 
ncRNA transcribed from heterochromatin remains bound 
to chromatin via the formation of an R loop and that the 
so-formed RNA/DNA hybrid itself is involved in the 
heterochromatin formation. In essence, RNA/DNA hybrid 
foci colocalize with centromeric heterochromatic regions, 
and overexpression of RNase H decreases these foci, which 
in turn disrupts heterochromatin formation. Given that, 
upon R-loop formation, the single-strand nontemplate 
DNA remains unpaired, Nakama et al. (2012) predicted 
that the RITS complex could target the single-stranded 
"unprotected" DNA or, alternatively, the chromatin-asso- 
ciated RNA to generate heterochromatin formation. 

Even though this study did not discuss whether R-loop 
formation is the trigger of heterochromatin formation 
rather than its consequence, these observations again 
tightly connected the fields of transcription and hetero- 
chromatin, raising intriguing questions for further investi- 
gation. Do R loops facilitate transcriptional gene silencing 
in other regions of the genome? Is this phenomenon 
conserved in mammalian cells? Could R-loop "hot spots" 
be regions of RNAi-dependent heterochromatin assembly? 
In any case, this study suggested that R loops control an 
epigenetic mark directly or indirectly. Future studies on 
this potential connection will shed light on the "RNA-guided
pathway for the epigenome" (Jenuwein 2002) and
might place R loops as key players of the critical interre- 
lationship between transcription and chromatin. 

R loops and neurodegenerative disorders

The mechanistic connection between R loops and neu- 
rodegenerative disorders remains unclear even though 
examples and associations are growing. Apart from the 
connections presented for the Snord116 study, R loops are
often associated with neurodegenerative disease caused 
by abnormal expansion of repeated DNA sequences (the 
so-called repeat expansion disorders) (Lin et al. 2010; 
McIvor et al. 2010; Reddy et al. 2011; Wongsurawat
et al. 2012). Very recently, R loops were shown to form 
over the promoter of the fragile X mental retardation 1 
(FMR1) gene and coincide with its epigenetic silencing in
fragile X syndrome (Colak et al. 2014). A similar mech- 
anism has also recently been shown to occur in another 
trinucleotide repeat expansion disease, Friedreich's ataxia
(Groh et al. 2014). In this study, expanded disease alleles 
were shown to accumulate R loops, resulting in a tran- 
scriptional block and heterochromatin formation. G4 
structures were also found to form in a hexanucleotide 
repeat expansion of the C9orf72 gene, which causes ALS 
and frontotemporal dementia (FTD). G4 in C9orf72 DNA 
promotes the formation of stable R loops, which in turn 
impedes transcriptional elongation and leads to produc- 
tion of short, abortive transcripts (Haeusler et al. 2014). 



Furthermore, it was recently shown that topoiso- 
merases promote transcription of human and mouse long 
genes linked to autism (King et al. 2013). Interestingly, 
Snord116-Ube3a antisense is an extremely long transcription
unit, implying that topotecan might reduce the
expression of other long genes. Indeed, topotecan has also 
been shown to reduce the expression of other long genes 
in human and mouse neurons in a dose-dependent 
manner (King et al. 2013). Transcription of very long 
genes has also been shown to cause replication/transcrip- 
tion collisions and accumulation of R loops at the CFSs 
(Helmrich et al. 2011). Some of these genes have been 
shown to be down-regulated in neurological diseases, 
such as Alzheimer's disease (Sze et al. 2004). Additionally, 
defects in DNA repair proteins (which, as mentioned 
above, are connected to R loops) cause neurodegenerative 
syndromes (McKinnon 2009). Defective DNA repair in 
mature neuronal tissues has also been linked to aging and 
neurodegenerative disorders, such as Parkinson's disease 
and Alzheimer's disease. Senataxin has recently been 
suggested to act as a DNA repair protein (Yuce and West 
2013) by resolving R loops at human genes (Skourti- 
Stathaki et al. 2011). Given that mutations in senataxin 
cause specific neurodegenerative disorders (James and 
Talbot 2006; Palau and Espinos 2006), perhaps senataxin, 
despite being a ubiquitously expressed protein, has a spe- 
cial role in neuronal genes by controlling the transcription 
of some fragile sites present in long genes. Altogether, 
these studies reveal a complex and coordinated network in 
neurons between transcription, DNA repair, and R loops. 
Indeed, the general physiological relevance of R loops as 
transcriptional regulators seems more and more likely. 

Conclusions and perspectives 

The last decade has seen a significant expansion of our 
knowledge of R-loop biology and function. For years, R 
loops were considered a threat to cells as a rare transcrip- 
tional by-product (Aguilera and Garcia-Muse 2012). It is 
only now that we start to realize that they may have 
a major regulatory role in gene function. They can be the 
"two sides of a coin," deleterious structures but also fine- 
tuners of gene expression. Given their involvement in 
multiple cellular processes, understanding how cells pre- 
vent the negative functions of R loops yet allow their 
positive ones is a challenge for the years to come. Perhaps 
the key to this question is the unpaired ssDNA derived 
from these structures. It can be the trigger for genomic 
instability but can also provide base-pairing for trans 
RNAs or act as a binding scaffold for enzymes that 
control the transcription cycle. 

A great deal has been learned in recent years about 
factors that prevent or resolve R loops. Research should 
now aim to discover more factors (in addition to Rad51 
and AtNDX) that actively promote R-loop formation. Is 
there an evolutionarily conserved protein that is gener- 
ally responsible for R-loop formation? Answering this 
fundamental question will perhaps allow us to better 
understand the dual functions of R loops and also link R 
loops to hitherto unanticipated cellular processes. 

As mentioned above, R loops are thought to play a role
in neurodegenerative disorders even though strong evi- 
dence for this association has yet to be established. R loops 
could offer a novel angle on regulation of transcription, and 
it is now the time to unravel their possible links with 
cancer and neurodegenerative disease. From the examples 
mentioned in this review, it is evident that R loops lie at 
the interface of different fields: transcription, RNA
processing, DNA damage, and chromatin. More than ever, 
we need to interconnect these fields to fully understand 
how R loops modulate genome dynamics. 